Using Cost-Sensitive Learning to Determine Gene Conversions
Abstract
Gene conversion, a non-reciprocal transfer of genetic information from one sequence to another, is a biological process whose importance in affecting both short-term and long-term evolution cannot be overemphasized. Knowing where gene conversion has occurred gives us important insights into gene duplication and evolution in general. In this paper we present an ensemble-based learning method for predicting gene conversions using two different models of reticulate evolution. Since detecting gene conversion is a rare-class problem, we implement cost-sensitive learning in the form of a generated cost matrix that is used to modify various underlying classifiers. Results show that our method combines the predictive power of different models and is able to predict gene conversion more accurately than either of the two studied models. Our work provides a useful framework for future improvement of gene conversion predictions through multiple models of gene conversion.
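To make the role of a cost matrix in a rare-class problem concrete, the following is a minimal sketch of minimum-expected-cost classification. It is not the authors' implementation: the function name `min_cost_label`, the class encoding (0 = no conversion, 1 = gene conversion), and the cost values 1 and 10 are illustrative assumptions, whereas the paper generates its cost matrix.

```python
def min_cost_label(probs, cost):
    """Return the label with the lowest expected misclassification cost.

    probs[j]   -- estimated probability that the true class is j
    cost[j][i] -- cost of predicting class i when the true class is j
    (Both conventions are assumptions made for this sketch.)
    """
    n_labels = len(cost[0])
    expected = [
        sum(probs[j] * cost[j][i] for j in range(len(probs)))
        for i in range(n_labels)
    ]
    # Pick the prediction whose expected cost is smallest.
    return min(range(n_labels), key=expected.__getitem__)


# Hypothetical cost matrix: missing a rare gene conversion (false negative)
# is assumed to be ten times as costly as a false alarm (false positive).
COST = [
    [0.0, 1.0],   # true class 0: correct = 0, false positive = 1
    [10.0, 0.0],  # true class 1: false negative = 10, correct = 0
]

# Uniform costs reduce to picking the most probable class.
UNIFORM = [
    [0.0, 1.0],
    [1.0, 0.0],
]

if __name__ == "__main__":
    probs = [0.8, 0.2]  # a classifier is 80% sure there is no conversion
    print(min_cost_label(probs, UNIFORM))  # most probable class
    print(min_cost_label(probs, COST))     # cost-sensitive decision
```

With these assumed costs, the rare class is predicted whenever its estimated probability exceeds 1/11, rather than 1/2, which is how a cost matrix can shift an underlying classifier toward the class of interest.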
Similar resources
Optimizing a Cost Matrix to Solve Rare-Class Biological Problems
In a binary dataset, a rare-class problem occurs when one class of data (typically the class of interest) is far outweighed by the other. Such a problem is typically difficult to learn and classify and is quite common, especially among biological problems such as the identification of gene conversions. A multitude of solutions for this problem exist with varying levels of success. In this paper...
A New Formulation for Cost-Sensitive Two Group Support Vector Machine with Multiple Error Rate
Support vector machine (SVM) is a popular classification technique which classifies data using a max-margin separating hyperplane. The normal vector and bias of this hyperplane are determined by solving a quadratic model, which means that SVM training amounts to solving an optimization problem. Among the extensions of SVM, the cost-sensitive scheme refers to a model with multiple costs which conside...
Proposing a Novel Cost Sensitive Imbalanced Classification Method based on Hybrid of New Fuzzy Cost Assigning Approaches, Fuzzy Clustering and Evolutionary Algorithms
In this paper, a new hybrid methodology is introduced to design a cost-sensitive fuzzy rule-based classification system. A novel cost metric is proposed based on the combination of three different concepts: Entropy, Gini index and DKM criterion. In order to calculate the effective cost of patterns, a hybrid of fuzzy c-means clustering and particle swarm optimization algorithm is utilized. This ...
Credit Card Fraud Detection using Data mining and Statistical Methods
Due to today’s advances in technology and business, fraud detection has become a critical component of financial transactions. Given the vast amounts of data in large datasets, detecting fraudulent transactions manually has become increasingly difficult. In this research, we propose a combined method using both data mining and statistical tasks, utilizing feature selection, resampling and cost-...
The Comparison of the Effectiveness of a Modified Conformation Sensitive Gel Electrophoresis with Denaturing High Performance Liquid Chromatography
Background: Several methods have been developed for detection of sequence variation in genes, and each has its advantages and disadvantages. A common disadvantage is that the simpler, cost-effective methods are perceived as being less sensitive in their detection of sequence variation, whereas those with proven sensitivity have a requirement for complex or expensive laboratory equipmen...
Publication date: 2008